Psychoacoustics is the branch of psychophysics involving the scientific study of the perception of sound by the human auditory system. It is the branch of science studying the psychological responses associated with sound including noise, speech, and music. Psychoacoustics is an interdisciplinary field including psychology, acoustics, electronic engineering, physics, biology, physiology, and computer science.
The inner ear, for example, does significant signal processing in converting sound into neural stimuli, this processing renders certain differences between waveforms imperceptible. Data compression techniques, such as MP3, make use of this fact. In addition, the ear has a nonlinear response to sounds of different intensity levels; this nonlinear response is called loudness. Telephone networks and audio noise reduction systems make use of this fact by nonlinearly compressing data samples before transmission and then expanding them for playback. Another effect of the ear's nonlinear response is that sounds that are close in frequency produce phantom beat notes, or intermodulation distortion products.
Human perception of audio signal time separation has been measured to be less than . This does not mean that frequencies above are audible, but that time discrimination is not directly coupled with frequency range.
Frequency resolution of the ear is about 3.6 Hz within the octave of That is, changes in pitch larger than 3.6 Hz can be perceived in a clinical setting. However, even smaller pitch differences can be perceived through other means. For example, the interference of two pitches can often be heard as a repetitive variation in the volume of the tone. This amplitude modulation occurs with a frequency equal to the difference in frequencies of the two tones and is known as beating.
The semitone scale used in Western musical notation is not a linear frequency scale but logarithmic. Other scales have been derived directly from experiments on human hearing perception, such as the mel scale and Bark scale (these are used in studying perception, but not usually in musical composition), and these are approximately logarithmic in frequency at the high-frequency end, but nearly linear at the low-frequency end.
The intensity range of audible sounds is enormous. Human eardrums are sensitive to variations in sound pressure and can detect pressure changes from as small as a few (μPa) to greater than . For this reason, sound pressure level is also measured logarithmically, with all pressures referenced to (or ). The lower limit of audibility is therefore defined as , but the upper limit is not as clearly defined. The upper limit is more a question of the potential to cause noise-induced hearing loss.
A more rigorous exploration of the lower limits of audibility determines that the minimum threshold at which a sound can be heard is frequency dependent. By measuring this minimum intensity for test tones of various frequencies, a frequency-dependent absolute threshold of hearing (ATH) curve may be derived. Typically, the ear shows a peak of sensitivity (i.e., its lowest ATH) between , though the threshold changes with age, with older ears showing decreased sensitivity above 2 kHz.
The ATH is the lowest of the equal-loudness contours. Equal-loudness contours indicate the sound pressure level (dB SPL), over the range of audible frequencies, that are perceived as being of equal loudness. Equal-loudness contours were first measured by Fletcher and Munson at Bell Labs in 1933 using pure tones reproduced via headphones, and the data they collected are called Fletcher–Munson curves. Because subjective loudness was difficult to measure, the Fletcher–Munson curves were averaged over many subjects. Robinson and Dadson refined the process in 1956 to obtain a new set of equal-loudness curves for a frontal sound source measured in an anechoic chamber. The Robinson-Dadson curves were standardized as in 1986. In 2003, was revised using data collected from 12 international studies.
It can explain how a sharp clap of the hands might seem painfully loud in a quiet library but is hardly noticeable after a car backfires on a busy, urban street. This provides great benefit to the overall compression ratio, and psychoacoustic analysis routinely leads to compressed music files that are one-tenth to one-twelfth the size of high-quality masters, but with discernibly less proportional quality loss. Such compression is a feature of nearly all modern lossy audio compression formats. Some of these formats include Dolby Digital (AC-3), MP3, Opus, Ogg Vorbis, AAC, WMA, MPEG-1 Layer II (used for digital audio broadcasting in several countries), and ATRAC, the compression used in MiniDisc and some Walkman models.
Psychoacoustics is based heavily on human anatomy, especially the ear's limitations in perceiving sound as outlined previously. To summarize, these limitations are:
A compression algorithm can assign a lower priority to sounds outside the range of human hearing. By carefully shifting bits away from the unimportant components and toward the important ones, the algorithm ensures that the sounds a listener is most likely to perceive are most accurately represented.
Irv Teibel's Environments series LPs (1969–79) are an early example of commercially available sounds released expressly for enhancing psychological abilities.
Licklider wrote a paper entitled "A duplex theory of pitch perception".
Psychoacoustics is applied within many fields of software development, where developers map proven and experimental mathematical patterns in digital signal processing. Many audio compression codecs such as MP3 and Opus use a psychoacoustic model to increase compression ratios. The success of Home audio for the reproduction of music in theatres and homes can be attributed to psychoacoustics
Automobile manufacturers engineer their engines and even doors to have a certain sound.
|
|